An analytical model of high performance superscalar-based multiprocessors
Authors
Abstract
Several shared memory multiprocessor models using approximate Mean Value Analysis (MVA) have been developed and used to evaluate a number of system architectures. Since this time, the complexity of multiprocessor systems has increased as superscalar processors and latency reduction techniques are employed in these systems. We present an MVA multiprocessor performance model which incorporates these new features and, in addition, increases the level of modeling detail to improve flexibility and accuracy. We describe in detail extensions present in our model that allow us to analyze the impact of these new features. We then use the model to demonstrate some of the tradeoffs involved in designing modern multiprocessors, including the impact of highly superscalar architectures on the scalability of multiprocessor systems.

1 Introduction

An analytical modeling technique that has been frequently used to evaluate shared memory multiprocessors is approximate Mean Value Analysis (MVA) [12]. In MVA, a set of equations that represent the mean response times and mean waiting times of various performance elements are derived using the mean values of various system parameters as model inputs. For example, a simple multiprocessor MVA model may include the mean response time of a cache miss as a response time equation, the mean bus waiting time as a waiting time equation, and the mean time between cache misses as a model input. Once constructed, these equations have circular dependencies and must be solved iteratively. The power of MVA modeling lies in its computation efficiency. Convergence is usually achieved within a second, independent of the number of processors and memories in the model. Many models of multiprocessor systems have been created using this technique and have been used to evaluate the performance of both research prototype [9, 13] and commercial [10] multiprocessors.
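The iterative solution of circularly dependent response time and waiting time equations can be illustrated with a minimal sketch. It is not the model presented in this paper: it assumes a single shared bus, uses the textbook Bard-Schweitzer approximation for the queue seen by an arriving request, and every name in it (`solve_bus_mva`, `BusMvaResult`, and the parameters) is hypothetical.

```python
from typing import NamedTuple


class BusMvaResult(NamedTuple):
    response_time: float    # mean bus response time of a cache miss
    waiting_time: float     # mean time a miss waits for the bus
    throughput: float       # misses served per unit time, all processors
    bus_utilization: float  # fraction of time the bus is busy
    iterations: int         # fixed-point iterations used


def solve_bus_mva(n_procs, time_between_misses, bus_service_time,
                  tol=1e-9, max_iter=100_000):
    """Solve a toy single-bus MVA model by fixed-point iteration.

    Assumption (not from the paper): an arriving request sees
    (N - 1) / N of the mean bus queue length (Bard-Schweitzer).
    """
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    queue = 0.0  # initial guess for the mean number of requests at the bus
    for iteration in range(1, max_iter + 1):
        # Queue length seen on arrival depends on the current queue estimate.
        seen = (n_procs - 1) / n_procs * queue
        # Response time equation: own service plus service of requests ahead.
        response = bus_service_time * (1.0 + seen)
        # Each processor cycles between computing and waiting on a miss.
        throughput = n_procs / (time_between_misses + response)
        # Little's law closes the circular dependency.
        new_queue = throughput * response
        converged = abs(new_queue - queue) < tol
        queue = new_queue
        if converged:
            return BusMvaResult(
                response_time=response,
                waiting_time=response - bus_service_time,
                throughput=throughput,
                bus_utilization=throughput * bus_service_time,
                iterations=iteration,
            )
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, solve_bus_mva(n, time_between_misses=9.0, bus_service_time=1.0))
```

With a single processor there is no contention, so the waiting time is zero and the response time equals the bus service time. As processors are added, the waiting time grows while the cost of solving the model stays a handful of scalar updates per iteration.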
The results obtained from approximate MVA models have been shown [5] to correlate well with those obtained from trace-driven simulation models. Since these models have been constructed, superscalar
Related resources
Running Parallel Applications on an Mp with Multithreaded Superscalar Processors
With lesser returns on adding more complexity to conventional superscalar processors, simultaneous multithreaded (SMT) superscalar processors seem to be a promising alternative. Unfortunately, most previous work has focused on systems running multiprogrammed loads of sequential applications. It is not clear how well these processors work in a shared-memory multiprocessor environment running par...
A Mean Value Analysis Multiprocessor Model Incorporating
Several approximate Mean Value Analysis (MVA) shared memory multiprocessor models have been developed and used to evaluate a number of system architectures. In recent years, the use of superscalar processors, multilevel cache hierarchies, and latency tolerating techniques has significantly increased the complexity of multiprocessor system modeling. We present an analytical performance model whic...
A Mean Value Analysis Multiprocessor Model Incorporating Superscalar Processors and Latency Tolerating Techniques 1
Several approximate Mean Value Analysis (MVA) shared memory multiprocessor models have been developed and used to evaluate a number of system architectures. In recent years, the use of superscalar processors, multilevel cache hierarchies, and latency tolerating techniques has significantly increased the complexity of multiprocessor system modeling. We present an analytical performance model whi...
Effective Instruction Prefetching In Chip Multiprocessors
threaded application performance, often achieved through instruction level parallelism per chip is increasing, the software and hardware techniques to exploit the potential of studies mostly involve distributed shared memory multiprocessors and fetching will not be fully effective at masking the remote fetch latency. the effective address of the load instructions along that path based upon a hi...
A New Synchronization Scheme for Memory Consistency Model (Extended Abstract)
Modern scalable multiprocessors are mostly built with a distributed shared memory architecture. Large-scale shared memory multiprocessors have long latencies for remote memory accesses, and these latencies can quickly offset the system performance gained from exploiting parallelism. In order to improve system performance, we must reduce memory latencies. The useful way for th...